Why does b' (and sometimes b' ') show up when I split some HTML source? [Python]
Posted by Oliver on Stack Overflow
Published on 2011-11-12T04:05:35Z
Indexed on 2011/11/12 9:51 UTC
I'm fairly new to Python and programming in general. I have done a few tutorials and am about 2/3 of the way through a pretty good book. To get more comfortable with Python and programming, I've been trying things out from the standard library.
I have recently run into a weird quirk that I'm sure is the result of my own incorrect or un-"pythonic" use of the urllib module (with Python 3.2.2):
import urllib.request
# The URL has to be a string and include the scheme; www.somelink.com is a placeholder.
HTML_source = urllib.request.urlopen("http://www.somelink.com").read()
print(HTML_source)
When this bit is run in the interpreter, it prints the HTML source of somelink, but prefixed with b', for example:
b'<HTML>\r\n<HEAD> (etc). . . .
If I split the string into a list by whitespace, every item is prefixed with b'.
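The behaviour described above can be reproduced without a network connection. The sketch below is only an illustration: the `raw` value is a made-up stand-in for what `read()` returns, and UTF-8 is an assumed encoding. It shows that `read()` yields a `bytes` object, that splitting `bytes` yields more `bytes` (each displayed with the `b'` prefix), and that decoding to `str` first makes the prefix go away.

```python
# Hypothetical stand-in for the result of urlopen(...).read(): a bytes object.
raw = b"<HTML>\r\n<HEAD> <TITLE>demo</TITLE>"

# Splitting bytes on whitespace gives a list of bytes objects,
# so every item is displayed with the b' prefix.
byte_parts = raw.split()
print(byte_parts)

# Decoding first gives a str, and its pieces are plain strings.
# UTF-8 is an assumption here; a real page may declare another charset.
text = raw.decode("utf-8")
text_parts = text.split()
print(text_parts)
```

The `b'` is therefore not part of the data; it is how Python 3 displays a `bytes` value.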
I'm not really trying to accomplish anything specific, just trying to familiarize myself with the standard library. I would like to know why this b' is being prefixed.
Bonus question: is there a better way to get the HTML source WITHOUT using a third-party module? I know all that jazz about not reinventing the wheel and whatnot, but I'm trying to learn by "building my own tools".
Thanks in advance!
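For the bonus question, one possible standard-library-only approach is sketched below. It is a rough sketch under assumptions, not a definitive answer: the helper names `decode_html` and `fetch_html` are made up for illustration, and falling back to UTF-8 when the server sends no charset is a guess that will not suit every page.

```python
import urllib.request


def decode_html(raw, charset=None):
    """Turn the bytes from read() into str, falling back to UTF-8 (an assumption)."""
    return raw.decode(charset or "utf-8", errors="replace")


def fetch_html(url):
    """Fetch a page using only the standard library and return its source as str."""
    with urllib.request.urlopen(url) as response:
        # get_content_charset() reads the charset from the Content-Type header;
        # it returns None when the server does not declare one.
        charset = response.headers.get_content_charset()
        return decode_html(response.read(), charset)
```

Calling something like `fetch_html("http://www.somelink.com")` (the placeholder URL from the question) should then return a `str`, which splits into plain strings without the `b'` prefix.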
© Stack Overflow or respective owner